Overview

Dataset statistics

Number of variables: 16
Number of observations: 11914
Missing cells: 3850
Missing cells (%): 2.0%
Duplicate rows: 715
Duplicate rows (%): 6.0%
Total size in memory: 6.8 MiB
Average record size in memory: 597.0 B
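The two percentages in the dataset statistics follow from the raw counts. A minimal sketch, assuming the missing-cell share is measured against every cell (variables × observations) and the duplicate share against all rows; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics.
n_variables = 16
n_observations = 11914
missing_cells = 3850
duplicate_rows = 715

# Assumption: the missing-cell share is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is taken over all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Both values round to the figures the report shows (2.0% and 6.0%).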

Variable types

Categorical (CAT): 9
Numeric (NUM): 7

Warnings

Duplicates: Dataset has 715 (6.0%) duplicate rows
High cardinality: Model has a high cardinality: 915 distinct values
High cardinality: Market Category has a high cardinality: 71 distinct values
Missing: Market Category has 3742 (31.4%) missing values
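The two high-cardinality warnings are consistent with a simple threshold on distinct counts. The sketch below assumes a cutoff of 50 distinct values; that cutoff is inferred from this report (Make, with 48 distinct values, is not flagged) and is not confirmed as the tool's setting. The distinct counts are those reported for the named categorical variables.

```python
# Distinct counts reported for the named categorical variables.
distinct_counts = {
    "Make": 48,
    "Model": 915,
    "Engine Fuel Type": 10,
    "Driven_Wheels": 4,
    "Number of Doors": 3,
    "Market Category": 71,
    "Vehicle Size": 3,
    "Vehicle Style": 16,
}

# Hypothetical cutoff, inferred from which variables the report flags.
CARDINALITY_THRESHOLD = 50

def high_cardinality(counts, threshold=CARDINALITY_THRESHOLD):
    """Return the variables whose distinct count exceeds the threshold."""
    return [name for name, n in counts.items() if n > threshold]

flagged = high_cardinality(distinct_counts)
print(flagged)
```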

Reproduction

Analysis started: 2021-11-07 15:56:23.758881
Analysis finished: 2021-11-07 15:56:42.265116
Duration: 18.51 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
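The reported duration is the difference between the two timestamps. A short sketch using only the standard library; the format string is an assumption about how the timestamps are laid out:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-11-07 15:56:23.758881", FMT)
finished = datetime.strptime("2021-11-07 15:56:42.265116", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```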

Variables

Make
Categorical

Distinct: 48
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
Chevrolet | 1123 | 9.4%
Ford | 881 | 7.4%
Volkswagen | 809 | 6.8%
Toyota | 746 | 6.3%
Dodge | 626 | 5.3%
Nissan | 558 | 4.7%
GMC | 515 | 4.3%
Honda | 449 | 3.8%
Mazda | 423 | 3.6%
Cadillac | 397 | 3.3%
Mercedes-Benz | 353 | 3.0%
Suzuki | 351 | 2.9%
BMW | 334 | 2.8%
Infiniti | 330 | 2.8%
Audi | 328 | 2.8%
Hyundai | 303 | 2.5%
Volvo | 281 | 2.4%
Subaru | 256 | 2.1%
Acura | 252 | 2.1%
Kia | 231 | 1.9%
Mitsubishi | 213 | 1.8%
Lexus | 202 | 1.7%
Buick | 196 | 1.6%
Chrysler | 187 | 1.6%
Pontiac | 186 | 1.6%
Other values (23) | 1384 | 11.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 13
Median length: 6
Mean length: 6.555648817
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 47
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
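As an illustration of these character properties, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module for three Make values taken from the table. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names, not part of the report; the standard library exposes general categories but not scripts or blocks, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in these values.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["Chevrolet", "Ford", "Mercedes-Benz"]
counts = category_counts(sample)
print(counts)
```

Weighting each value by its row count instead of counting it once would give per-category totals of the kind shown in the tables for this variable.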

Most occurring characters

ValueCountFrequency (%) 
o71099.1%
 
e62548.0%
 
a61887.9%
 
i48836.3%
 
n41695.3%
 
d40535.2%
 
r39365.0%
 
l38995.0%
 
s36144.6%
 
t30333.9%
 
u28223.6%
 
C22222.8%
 
M20442.6%
 
h18092.3%
 
c17962.3%
 
v15472.0%
 
g14901.9%
 
y14421.8%
 
k13591.7%
 
z11271.4%
 
V10901.4%
 
F10121.3%
 
B9601.2%
 
T8261.1%
 
w8091.0%
 
Other values (22)861111.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter6296680.6%
 
Uppercase Letter1451318.6%
 
Dash Punctuation3840.5%
 
Space Separator2410.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
C222215.3%
 
M204414.1%
 
V10907.5%
 
F10127.0%
 
B9606.6%
 
T8265.7%
 
S7815.4%
 
H7695.3%
 
A7405.1%
 
D6264.3%
 
L5954.1%
 
N5583.8%
 
G5183.6%
 
P4042.8%
 
I3922.7%
 
W3342.3%
 
K2311.6%
 
R2271.6%
 
O1501.0%
 
U170.1%
 
E170.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o710911.3%
 
e62549.9%
 
a61889.8%
 
i48837.8%
 
n41696.6%
 
d40536.4%
 
r39366.3%
 
l38996.2%
 
s36145.7%
 
t30334.8%
 
u28224.5%
 
h18092.9%
 
c17962.9%
 
v15472.5%
 
g14902.4%
 
y14422.3%
 
k13592.2%
 
z11271.8%
 
w8091.3%
 
b7981.3%
 
f3350.5%
 
m2890.5%
 
x2020.3%
 
p3< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-384100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
241100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin7747999.2%
 
Common6250.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o71099.2%
 
e62548.1%
 
a61888.0%
 
i48836.3%
 
n41695.4%
 
d40535.2%
 
r39365.1%
 
l38995.0%
 
s36144.7%
 
t30333.9%
 
u28223.6%
 
C22222.9%
 
M20442.6%
 
h18092.3%
 
c17962.3%
 
v15472.0%
 
g14901.9%
 
y14421.9%
 
k13591.8%
 
z11271.5%
 
V10901.4%
 
F10121.3%
 
B9601.2%
 
T8261.1%
 
w8091.0%
 
Other values (20)798610.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
-38461.4%
 
24138.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII78104100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o71099.1%
 
e62548.0%
 
a61887.9%
 
i48836.3%
 
n41695.3%
 
d40535.2%
 
r39365.0%
 
l38995.0%
 
s36144.6%
 
t30333.9%
 
u28223.6%
 
C22222.8%
 
M20442.6%
 
h18092.3%
 
c17962.3%
 
v15472.0%
 
g14901.9%
 
y14421.8%
 
k13591.7%
 
z11271.4%
 
V10901.4%
 
F10121.3%
 
B9601.2%
 
T8261.1%
 
w8091.0%
 
Other values (22)861111.0%
 

Model
Categorical

HIGH CARDINALITY

Distinct: 915
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
Silverado 1500 | 156 | 1.3%
Tundra | 140 | 1.2%
F-150 | 126 | 1.1%
Sierra 1500 | 90 | 0.8%
Beetle Convertible | 89 | 0.7%
Tacoma | 80 | 0.7%
Frontier | 76 | 0.6%
GTI | 76 | 0.6%
Beetle | 75 | 0.6%
Accord | 75 | 0.6%
Civic | 68 | 0.6%
Dakota | 65 | 0.5%
Ram Pickup 1500 | 62 | 0.5%
Jetta | 61 | 0.5%
Corvette | 60 | 0.5%
911 | 58 | 0.5%
C/K 1500 Series | 56 | 0.5%
XC60 | 54 | 0.5%
Colorado | 54 | 0.5%
3 | 53 | 0.4%
370Z | 51 | 0.4%
Sienna | 51 | 0.4%
Golf GTI | 50 | 0.4%
Sierra 1500 Classic | 49 | 0.4%
Silverado 1500 Classic | 49 | 0.4%
Other values (890) | 10090 | 84.7%

Frequencies of value counts

Unique

Unique: 40
Unique (%): 0.3%
Histogram of lengths of the category

Length

Max length: 35
Median length: 6
Mean length: 7.246852443
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 66
Unique unicode categories: 6
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a70368.1%
 
e68117.9%
 
r61737.1%
 
o44495.2%
 
i40924.7%
 
38384.4%
 
n37724.4%
 
t35714.1%
 
028723.3%
 
s28533.3%
 
l28213.3%
 
S27693.2%
 
C25833.0%
 
u19602.3%
 
d16801.9%
 
c16071.9%
 
T15831.8%
 
514321.7%
 
p11811.4%
 
-11521.3%
 
R11301.3%
 
g10851.3%
 
v10831.3%
 
X10471.2%
 
110071.2%
 
Other values (41)1675219.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter5474763.4%
 
Uppercase Letter1846521.4%
 
Decimal Number80689.3%
 
Space Separator38384.4%
 
Dash Punctuation11521.3%
 
Other Punctuation690.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0287235.6%
 
5143217.7%
 
1100712.5%
 
37118.8%
 
64295.3%
 
44255.3%
 
73594.4%
 
93284.1%
 
23274.1%
 
81782.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3838100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S276915.0%
 
C258314.0%
 
T15838.6%
 
R11306.1%
 
X10475.7%
 
E9565.2%
 
G8944.8%
 
A8414.6%
 
M7664.1%
 
L7564.1%
 
V7003.8%
 
F6753.7%
 
B5553.0%
 
P5142.8%
 
I4802.6%
 
D3321.8%
 
H3251.8%
 
J2931.6%
 
W2391.3%
 
Q2201.2%
 
K2121.1%
 
Z1871.0%
 
N1861.0%
 
O1200.6%
 
Y750.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a703612.9%
 
e681112.4%
 
r617311.3%
 
o44498.1%
 
i40927.5%
 
n37726.9%
 
t35716.5%
 
s28535.2%
 
l28215.2%
 
u19603.6%
 
d16803.1%
 
c16072.9%
 
p11812.2%
 
g10852.0%
 
v10832.0%
 
m8631.6%
 
y7291.3%
 
b7271.3%
 
k5551.0%
 
h4820.9%
 
z3510.6%
 
x3410.6%
 
f2500.5%
 
q1270.2%
 
w1210.2%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1152100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/6695.7%
 
.34.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin7321284.8%
 
Common1312715.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
383829.2%
 
0287221.9%
 
5143210.9%
 
-11528.8%
 
110077.7%
 
37115.4%
 
64293.3%
 
44253.2%
 
73592.7%
 
93282.5%
 
23272.5%
 
81781.4%
 
/660.5%
 
.3< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a70369.6%
 
e68119.3%
 
r61738.4%
 
o44496.1%
 
i40925.6%
 
n37725.2%
 
t35714.9%
 
s28533.9%
 
l28213.9%
 
S27693.8%
 
C25833.5%
 
u19602.7%
 
d16802.3%
 
c16072.2%
 
T15832.2%
 
p11811.6%
 
R11301.5%
 
g10851.5%
 
v10831.5%
 
X10471.4%
 
E9561.3%
 
G8941.2%
 
m8631.2%
 
A8411.1%
 
M7661.0%
 
Other values (27)960613.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII86339100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a70368.1%
 
e68117.9%
 
r61737.1%
 
o44495.2%
 
i40924.7%
 
38384.4%
 
n37724.4%
 
t35714.1%
 
028723.3%
 
s28533.3%
 
l28213.3%
 
S27693.2%
 
C25833.0%
 
u19602.3%
 
d16801.9%
 
c16071.9%
 
T15831.8%
 
514321.7%
 
p11811.4%
 
-11521.3%
 
R11301.3%
 
g10851.3%
 
v10831.3%
 
X10471.2%
 
110071.2%
 
Other values (41)1675219.4%
 

Year
Real number (ℝ≥0)

Distinct: 28
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.384338
Minimum: 1990
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 1990
5-th percentile: 1993
Q1: 2007
Median: 2015
Q3: 2016
95-th percentile: 2017
Maximum: 2017
Range: 27
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.579739888
Coefficient of variation (CV): 0.003770293941
Kurtosis: 0.2998384857
Mean: 2010.384338
Median Absolute Deviation (MAD): 2
Skewness: -1.221980746
Sum: 23951719
Variance: 57.45245676
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)

Value | Count | Frequency (%)
2015 | 2170 | 18.2%
2016 | 2157 | 18.1%
2017 | 1668 | 14.0%
2014 | 589 | 4.9%
2012 | 387 | 3.2%
2009 | 379 | 3.2%
2013 | 366 | 3.1%
2008 | 349 | 2.9%
2007 | 345 | 2.9%
2010 | 298 | 2.5%
2011 | 285 | 2.4%
2003 | 238 | 2.0%
2004 | 235 | 2.0%
2005 | 213 | 1.8%
1993 | 209 | 1.8%
2002 | 205 | 1.7%
2006 | 205 | 1.7%
1992 | 177 | 1.5%
1997 | 175 | 1.5%
2001 | 168 | 1.4%
1994 | 163 | 1.4%
1998 | 154 | 1.3%
1991 | 152 | 1.3%
1995 | 135 | 1.1%
1996 | 131 | 1.1%
Other values (3) | 361 | 3.0%

Value | Count | Frequency (%)
1990 | 123 | 1.0%
1991 | 152 | 1.3%
1992 | 177 | 1.5%
1993 | 209 | 1.8%
1994 | 163 | 1.4%
1995 | 135 | 1.1%
1996 | 131 | 1.1%
1997 | 175 | 1.5%
1998 | 154 | 1.3%
1999 | 120 | 1.0%

Value | Count | Frequency (%)
2017 | 1668 | 14.0%
2016 | 2157 | 18.1%
2015 | 2170 | 18.2%
2014 | 589 | 4.9%
2013 | 366 | 3.1%
2012 | 387 | 3.2%
2011 | 285 | 2.4%
2010 | 298 | 2.5%
2009 | 379 | 3.2%
2008 | 349 | 2.9%

Engine Fuel Type
Categorical

Distinct: 10
Distinct (%): 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
regular unleaded | 7172 | 60.2%
premium unleaded (required) | 2009 | 16.9%
premium unleaded (recommended) | 1523 | 12.8%
flex-fuel (unleaded/E85) | 899 | 7.5%
diesel | 154 | 1.3%
electric | 66 | 0.6%
flex-fuel (premium unleaded required/E85) | 54 | 0.5%
flex-fuel (premium unleaded recommended/E85) | 26 | 0.2%
flex-fuel (unleaded/natural gas) | 6 | 0.1%
natural gas | 2 | < 0.1%
(Missing) | 3 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 44
Median length: 16
Mean length: 20.25297969
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 8
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e4534518.8%
 
d2869311.9%
 
u2552910.6%
 
r237059.8%
 
l210598.7%
 
a188887.8%
 
153896.4%
 
n132525.5%
 
m103224.3%
 
g71803.0%
 
i58952.4%
 
(45171.9%
 
)45171.9%
 
p36121.5%
 
q20630.9%
 
f19700.8%
 
c16810.7%
 
o15490.6%
 
x9850.4%
 
-9850.4%
 
/9850.4%
 
E9790.4%
 
89790.4%
 
59790.4%
 
s1620.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter21196487.8%
 
Space Separator153896.4%
 
Open Punctuation45171.9%
 
Close Punctuation45171.9%
 
Decimal Number19580.8%
 
Dash Punctuation9850.4%
 
Other Punctuation9850.4%
 
Uppercase Letter9790.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e4534521.4%
 
d2869313.5%
 
u2552912.0%
 
r2370511.2%
 
l210599.9%
 
a188888.9%
 
n132526.3%
 
m103224.9%
 
g71803.4%
 
i58952.8%
 
p36121.7%
 
q20631.0%
 
f19700.9%
 
c16810.8%
 
o15490.7%
 
x9850.5%
 
s1620.1%
 
t74< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
15389100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(4517100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)4517100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-985100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/985100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
E979100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
897950.0%
 
597950.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin21294388.3%
 
Common2835111.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e4534521.3%
 
d2869313.5%
 
u2552912.0%
 
r2370511.1%
 
l210599.9%
 
a188888.9%
 
n132526.2%
 
m103224.8%
 
g71803.4%
 
i58952.8%
 
p36121.7%
 
q20631.0%
 
f19700.9%
 
c16810.8%
 
o15490.7%
 
x9850.5%
 
E9790.5%
 
s1620.1%
 
t74< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
1538954.3%
 
(451715.9%
 
)451715.9%
 
-9853.5%
 
/9853.5%
 
89793.5%
 
59793.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII241294100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e4534518.8%
 
d2869311.9%
 
u2552910.6%
 
r237059.8%
 
l210598.7%
 
a188887.8%
 
153896.4%
 
n132525.5%
 
m103224.3%
 
g71803.0%
 
i58952.4%
 
(45171.9%
 
)45171.9%
 
p36121.5%
 
q20630.9%
 
f19700.8%
 
c16810.7%
 
o15490.6%
 
x9850.4%
 
-9850.4%
 
/9850.4%
 
E9790.4%
 
89790.4%
 
59790.4%
 
s1620.1%
 

Engine HP
Real number (ℝ≥0)

Distinct: 356
Distinct (%): 3.0%
Missing: 69
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.3860701
Minimum: 55
Maximum: 1001
Zeros: 0
Zeros (%): 0.0%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 55
5-th percentile: 118
Q1: 170
Median: 227
Q3: 300
95-th percentile: 460
Maximum: 1001
Range: 946
Interquartile range (IQR): 130

Descriptive statistics

Standard deviation: 109.1918703
Coefficient of variation (CV): 0.4378427
Kurtosis: 2.323884426
Mean: 249.3860701
Median Absolute Deviation (MAD): 63
Skewness: 1.291229813
Sum: 2953978
Variance: 11922.86453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
200 | 456 | 3.8%
170 | 351 | 2.9%
210 | 320 | 2.7%
240 | 268 | 2.2%
285 | 246 | 2.1%
185 | 241 | 2.0%
150 | 232 | 1.9%
138 | 199 | 1.7%
300 | 192 | 1.6%
220 | 171 | 1.4%
355 | 158 | 1.3%
155 | 156 | 1.3%
381 | 152 | 1.3%
160 | 146 | 1.2%
140 | 143 | 1.2%
180 | 135 | 1.1%
260 | 133 | 1.1%
420 | 133 | 1.1%
190 | 132 | 1.1%
290 | 132 | 1.1%
310 | 123 | 1.0%
275 | 123 | 1.0%
250 | 120 | 1.0%
280 | 120 | 1.0%
261 | 100 | 0.8%
Other values (331) | 7163 | 60.1%

Value | Count | Frequency (%)
55 | 2 | < 0.1%
62 | 2 | < 0.1%
63 | 13 | 0.1%
66 | 7 | 0.1%
73 | 9 | 0.1%
74 | 18 | 0.2%
78 | 8 | 0.1%
79 | 12 | 0.1%
81 | 19 | 0.2%
82 | 5 | < 0.1%

Value | Count | Frequency (%)
1001 | 3 | < 0.1%
750 | 2 | < 0.1%
731 | 3 | < 0.1%
720 | 4 | < 0.1%
707 | 6 | 0.1%
700 | 6 | 0.1%
670 | 1 | < 0.1%
662 | 4 | < 0.1%
661 | 1 | < 0.1%
660 | 1 | < 0.1%

Engine Cylinders
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 30
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.628828677
Minimum: 0
Maximum: 16
Zeros: 56
Zeros (%): 0.5%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 4
Median: 6
Q3: 6
95-th percentile: 8
Maximum: 16
Range: 16
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.780559348
Coefficient of variation (CV): 0.3163285739
Kurtosis: 1.974315725
Mean: 5.628828677
Median Absolute Deviation (MAD): 2
Skewness: 0.9645421532
Sum: 66893
Variance: 3.170391593
Monotonicity: Not monotonic
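
Several of the descriptive statistics for Engine Cylinders can be recomputed from the value counts reported for this variable, with the 30 missing rows excluded. A minimal sketch, assuming the variance is the sample variance (divisor n - 1), an assumption that reproduces the reported figure:

```python
import math

# Value -> count, copied from the Engine Cylinders frequency table.
value_counts = {0: 56, 3: 30, 4: 4752, 5: 225, 6: 4489,
                8: 2031, 10: 68, 12: 230, 16: 3}

n = sum(value_counts.values())                       # non-missing rows
total = sum(v * c for v, c in value_counts.items())  # the reported "Sum"
mean = total / n

# Sample variance: squared deviations divided by n - 1 (assumed).
ss = sum(c * (v - mean) ** 2 for v, c in value_counts.items())
variance = ss / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation = standard deviation / mean

print(n, total)
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 6))
```

The coefficient of variation is the standard deviation divided by the mean, which is why it is small for Year (a large mean with a modest spread) and close to 1 for Popularity.
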
Histogram with fixed size bins (bins=9)
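Value | Count | Frequency (%)
4 | 4752 | 39.9%
6 | 4489 | 37.7%
8 | 2031 | 17.0%
12 | 230 | 1.9%
5 | 225 | 1.9%
10 | 68 | 0.6%
0 | 56 | 0.5%
3 | 30 | 0.3%
16 | 3 | < 0.1%
(Missing) | 30 | 0.3%

Value | Count | Frequency (%)
0 | 56 | 0.5%
3 | 30 | 0.3%
4 | 4752 | 39.9%
5 | 225 | 1.9%
6 | 4489 | 37.7%
8 | 2031 | 17.0%
10 | 68 | 0.6%
12 | 230 | 1.9%
16 | 3 | < 0.1%

Value | Count | Frequency (%)
16 | 3 | < 0.1%
12 | 230 | 1.9%
10 | 68 | 0.6%
8 | 2031 | 17.0%
6 | 4489 | 37.7%
5 | 225 | 1.9%
4 | 4752 | 39.9%
3 | 30 | 0.3%
0 | 56 | 0.5%
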
Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
AUTOMATIC | 8266 | 69.4%
MANUAL | 2935 | 24.6%
AUTOMATED_MANUAL | 626 | 5.3%
DIRECT_DRIVE | 68 | 0.6%
UNKNOWN | 19 | 0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 16
Median length: 9
Mean length: 8.642689273
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
A2490624.2%
 
T1785217.3%
 
U1247212.1%
 
M1245312.1%
 
O89118.7%
 
I84028.2%
 
C83348.1%
 
N36183.5%
 
L35613.5%
 
E7620.7%
 
D7620.7%
 
_6940.7%
 
R1360.1%
 
V680.1%
 
K19< 0.1%
 
W19< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter10227599.3%
 
Connector Punctuation6940.7%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A2490624.4%
 
T1785217.5%
 
U1247212.2%
 
M1245312.2%
 
O89118.7%
 
I84028.2%
 
C83348.1%
 
N36183.5%
 
L35613.5%
 
E7620.7%
 
D7620.7%
 
R1360.1%
 
V680.1%
 
K19< 0.1%
 
W19< 0.1%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_694100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin10227599.3%
 
Common6940.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A2490624.4%
 
T1785217.5%
 
U1247212.2%
 
M1245312.2%
 
O89118.7%
 
I84028.2%
 
C83348.1%
 
N36183.5%
 
L35613.5%
 
E7620.7%
 
D7620.7%
 
R1360.1%
 
V680.1%
 
K19< 0.1%
 
W19< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
_694100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII102969100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
A2490624.2%
 
T1785217.3%
 
U1247212.1%
 
M1245312.1%
 
O89118.7%
 
I84028.2%
 
C83348.1%
 
N36183.5%
 
L35613.5%
 
E7620.7%
 
D7620.7%
 
_6940.7%
 
R1360.1%
 
V680.1%
 
K19< 0.1%
 
W19< 0.1%
 

Driven_Wheels
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
front wheel drive | 4787 | 40.2%
rear wheel drive | 3371 | 28.3%
all wheel drive | 2353 | 19.7%
four wheel drive | 1403 | 11.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 17
Median length: 16
Mean length: 16.20429747
Min length: 15

Overview of Unicode Properties

Unique unicode characters: 15
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e3911320.3%
 
r2484612.9%
 
2382812.3%
 
l166208.6%
 
w119146.2%
 
h119146.2%
 
d119146.2%
 
i119146.2%
 
v119146.2%
 
f61903.2%
 
o61903.2%
 
a57243.0%
 
n47872.5%
 
t47872.5%
 
u14030.7%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter16923087.7%
 
Space Separator2382812.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e3911323.1%
 
r2484614.7%
 
l166209.8%
 
w119147.0%
 
h119147.0%
 
d119147.0%
 
i119147.0%
 
v119147.0%
 
f61903.7%
 
o61903.7%
 
a57243.4%
 
n47872.8%
 
t47872.8%
 
u14030.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
23828100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin16923087.7%
 
Common2382812.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e3911323.1%
 
r2484614.7%
 
l166209.8%
 
w119147.0%
 
h119147.0%
 
d119147.0%
 
i119147.0%
 
v119147.0%
 
f61903.7%
 
o61903.7%
 
a57243.4%
 
n47872.8%
 
t47872.8%
 
u14030.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
23828100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII193058100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e3911320.3%
 
r2484612.9%
 
2382812.3%
 
l166208.6%
 
w119146.2%
 
h119146.2%
 
d119146.2%
 
i119146.2%
 
v119146.2%
 
f61903.2%
 
o61903.2%
 
a57243.0%
 
n47872.5%
 
t47872.5%
 
u14030.7%
 

Number of Doors
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 6
Missing (%): 0.1%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
4 | 8353 | 70.1%
2 | 3160 | 26.5%
3 | 395 | 3.3%
(Missing) | 6 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3
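
Every value of this variable has length 3, and the character counts reported for it include '.', '0', 'n' and 'a'. That is what one would see if whole-number floats with NaN for missing entries were rendered as text. This reading is an inference from the report, not something it states. A minimal sketch of that rendering, with counts taken from the frequency table:

```python
from collections import Counter

# Door counts and the number of missing rows, from the frequency table.
door_counts = {4.0: 8353, 2.0: 3160, 3.0: 395}
n_missing = 6

# Assumption: values were stored as floats (NaN for missing) and
# converted to text with str(), giving "4.0", "2.0", "3.0", and "nan".
char_counts = Counter()
for value, count in door_counts.items():
    for ch in str(value):
        char_counts[ch] += count
for ch in str(float("nan")):
    char_counts[ch] += n_missing

# Set of string lengths across all rendered values.
lengths = {len(str(v)) for v in door_counts} | {len(str(float("nan")))}
print(char_counts, lengths)
```

Under this assumption the tallies match the report's character counts for this variable: 11908 each for '.' and '0', 12 for 'n', 6 for 'a', and 35742 characters in total.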

Overview of Unicode Properties

Unique unicode characters: 7
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
.1190833.3%
 
01190833.3%
 
4835323.4%
 
231608.8%
 
33951.1%
 
n12< 0.1%
 
a6< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number2381666.6%
 
Other Punctuation1190833.3%
 
Lowercase Letter180.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
01190850.0%
 
4835335.1%
 
2316013.3%
 
33951.7%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.11908100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n1266.7%
 
a633.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common3572499.9%
 
Latin180.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
.1190833.3%
 
01190833.3%
 
4835323.4%
 
231608.8%
 
33951.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n1266.7%
 
a633.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII35742100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
.1190833.3%
 
01190833.3%
 
4835323.4%
 
231608.8%
 
33951.1%
 
n12< 0.1%
 
a6< 0.1%
 

Market Category
Categorical

HIGH CARDINALITY
MISSING

Distinct: 71
Distinct (%): 0.9%
Missing: 3742
Missing (%): 31.4%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
Crossover | 1110 | 9.3%
Flex Fuel | 872 | 7.3%
Luxury | 855 | 7.2%
Luxury,Performance | 673 | 5.6%
Hatchback | 641 | 5.4%
Performance | 601 | 5.0%
Crossover,Luxury | 410 | 3.4%
Luxury,High-Performance | 334 | 2.8%
Exotic,High-Performance | 261 | 2.2%
Hatchback,Performance | 252 | 2.1%
Factory Tuner,Luxury,High-Performance | 215 | 1.8%
High-Performance | 199 | 1.7%
Hybrid | 123 | 1.0%
Crossover,Luxury,Performance | 113 | 0.9%
Factory Tuner,High-Performance | 106 | 0.9%
Factory Tuner,Performance | 92 | 0.8%
Flex Fuel,Performance | 87 | 0.7%
Diesel | 84 | 0.7%
Exotic,Luxury,High-Performance | 79 | 0.7%
Hatchback,Hybrid | 72 | 0.6%
Crossover,Hatchback | 72 | 0.6%
Crossover,Performance | 69 | 0.6%
Crossover,Flex Fuel | 64 | 0.5%
Exotic,Factory Tuner,Luxury,High-Performance | 52 | 0.4%
Luxury,Hybrid | 52 | 0.4%
Other values (46) | 684 | 5.7%
(Missing) | 3742 | 31.4%

Frequencies of value counts

Unique

Unique: 5
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length: 54
Median length: 9
Mean length: 11.37141178
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 32
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r1594711.8%
 
e124639.2%
 
n116068.6%
 
a102807.6%
 
o86916.4%
 
u83946.2%
 
c70395.2%
 
x49863.7%
 
,47313.5%
 
s42743.2%
 
y42563.1%
 
P35012.6%
 
f35012.6%
 
m35012.6%
 
L32882.4%
 
F30152.2%
 
H29422.2%
 
l26001.9%
 
h25951.9%
 
i24411.8%
 
t23301.7%
 
C20341.5%
 
v20341.5%
 
18181.3%
 
b15551.1%
 
Other values (7)56574.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter11143582.3%
 
Uppercase Letter1610811.9%
 
Other Punctuation47313.5%
 
Space Separator18181.3%
 
Dash Punctuation13871.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
P350121.7%
 
L328820.4%
 
F301518.7%
 
H294218.3%
 
C203412.6%
 
T6213.9%
 
E5013.1%
 
D2061.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r1594714.3%
 
e1246311.2%
 
n1160610.4%
 
a102809.2%
 
o86917.8%
 
u83947.5%
 
c70396.3%
 
x49864.5%
 
s42743.8%
 
y42563.8%
 
f35013.1%
 
m35013.1%
 
l26002.3%
 
h25952.3%
 
i24412.2%
 
t23302.1%
 
v20341.8%
 
b15551.4%
 
g13871.2%
 
k12081.1%
 
d3470.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1818100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,4731100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1387100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin12754394.1%
 
Common79365.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r1594712.5%
 
e124639.8%
 
n116069.1%
 
a102808.1%
 
o86916.8%
 
u83946.6%
 
c70395.5%
 
x49863.9%
 
s42743.4%
 
y42563.3%
 
P35012.7%
 
f35012.7%
 
m35012.7%
 
L32882.6%
 
F30152.4%
 
H29422.3%
 
l26002.0%
 
h25952.0%
 
i24411.9%
 
t23301.8%
 
C20341.6%
 
v20341.6%
 
b15551.2%
 
g13871.1%
 
k12080.9%
 
Other values (4)16751.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
,473159.6%
 
181822.9%
 
-138717.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII135479100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r1594711.8%
 
e124639.2%
 
n116068.6%
 
a102807.6%
 
o86916.4%
 
u83946.2%
 
c70395.2%
 
x49863.7%
 
,47313.5%
 
s42743.2%
 
y42563.1%
 
P35012.6%
 
f35012.6%
 
m35012.6%
 
L32882.4%
 
F30152.2%
 
H29422.2%
 
l26001.9%
 
h25951.9%
 
i24411.8%
 
t23301.7%
 
C20341.5%
 
v20341.5%
 
18181.3%
 
b15551.1%
 
Other values (7)56574.2%
 

Vehicle Size
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
Compact | 4764 | 40.0%
Midsize | 4373 | 36.7%
Large | 2777 | 23.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 6.533825751
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 16
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
i874611.2%
 
a75419.7%
 
e71509.2%
 
C47646.1%
 
o47646.1%
 
m47646.1%
 
p47646.1%
 
c47646.1%
 
t47646.1%
 
M43735.6%
 
d43735.6%
 
s43735.6%
 
z43735.6%
 
L27773.6%
 
r27773.6%
 
g27773.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter6593084.7%
 
Uppercase Letter1191415.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
C476440.0%
 
M437336.7%
 
L277723.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i874613.3%
 
a754111.4%
 
e715010.8%
 
o47647.2%
 
m47647.2%
 
p47647.2%
 
c47647.2%
 
t47647.2%
 
d43736.6%
 
s43736.6%
 
z43736.6%
 
r27774.2%
 
g27774.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin77844100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i874611.2%
 
a75419.7%
 
e71509.2%
 
C47646.1%
 
o47646.1%
 
m47646.1%
 
p47646.1%
 
c47646.1%
 
t47646.1%
 
M43735.6%
 
d43735.6%
 
s43735.6%
 
z43735.6%
 
L27773.6%
 
r27773.6%
 
g27773.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII77844100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
i874611.2%
 
a75419.7%
 
e71509.2%
 
C47646.1%
 
o47646.1%
 
m47646.1%
 
p47646.1%
 
c47646.1%
 
t47646.1%
 
M43735.6%
 
d43735.6%
 
s43735.6%
 
z43735.6%
 
L27773.6%
 
r27773.6%
 
g27773.6%
 

Vehicle Style
Categorical

Distinct: 16
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 93.2 KiB

Value | Count | Frequency (%)
Sedan | 3048 | 25.6%
4dr SUV | 2488 | 20.9%
Coupe | 1211 | 10.2%
Convertible | 793 | 6.7%
4dr Hatchback | 702 | 5.9%
Crew Cab Pickup | 681 | 5.7%
Extended Cab Pickup | 623 | 5.2%
Wagon | 592 | 5.0%
2dr Hatchback | 506 | 4.2%
Passenger Minivan | 417 | 3.5%
Regular Cab Pickup | 392 | 3.3%
2dr SUV | 138 | 1.2%
Passenger Van | 128 | 1.1%
Cargo Van | 95 | 0.8%
Cargo Minivan | 71 | 0.6%
Convertible SUV | 29 | 0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 19
Median length: 7
Mean length: 8.992613732
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 33
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a95668.9%
 
e93128.7%
 
d81287.6%
 
79667.4%
 
n68296.4%
 
r64406.0%
 
S57035.3%
 
C45764.3%
 
c41123.8%
 
b37263.5%
 
i34943.3%
 
u32993.1%
 
431903.0%
 
p29072.7%
 
k29042.7%
 
V28782.7%
 
o27912.6%
 
U26552.5%
 
t26532.5%
 
P22412.1%
 
g16951.6%
 
v13101.2%
 
l12141.1%
 
H12081.1%
 
h12081.1%
 
Other values (8)51334.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter7398269.1%
 
Uppercase Letter2135619.9%
 
Space Separator79667.4%
 
Decimal Number38343.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S570326.7%
 
C457621.4%
 
V287813.5%
 
U265512.4%
 
P224110.5%
 
H12085.7%
 
E6232.9%
 
W5922.8%
 
M4882.3%
 
R3921.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a956612.9%
 
e931212.6%
 
d812811.0%
 
n68299.2%
 
r64408.7%
 
c41125.6%
 
b37265.0%
 
i34944.7%
 
u32994.5%
 
p29073.9%
 
k29043.9%
 
o27913.8%
 
t26533.6%
 
g16952.3%
 
v13101.8%
 
l12141.6%
 
h12081.6%
 
s10901.5%
 
w6810.9%
 
x6230.8%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
4319083.2%
 
264416.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
7966100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin9533889.0%
 
Common1180011.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a956610.0%
 
e93129.8%
 
d81288.5%
 
n68297.2%
 
r64406.8%
 
S57036.0%
 
C45764.8%
 
c41124.3%
 
b37263.9%
 
i34943.7%
 
u32993.5%
 
p29073.0%
 
k29043.0%
 
V28783.0%
 
o27912.9%
 
U26552.8%
 
t26532.8%
 
P22412.4%
 
g16951.8%
 
v13101.4%
 
l12141.3%
 
H12081.3%
 
h12081.3%
 
s10901.1%
 
w6810.7%
 
Other values (5)27182.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
796667.5%
 
4319027.0%
 
26445.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII107138100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a95668.9%
 
e93128.7%
 
d81287.6%
 
79667.4%
 
n68296.4%
 
r64406.0%
 
S57035.3%
 
C45764.3%
 
c41123.8%
 
b37263.5%
 
i34943.3%
 
u32993.1%
 
431903.0%
 
p29072.7%
 
k29042.7%
 
V28782.7%
 
o27912.6%
 
U26552.5%
 
t26532.5%
 
P22412.1%
 
g16951.6%
 
v13101.2%
 
l12141.1%
 
H12081.1%
 
h12081.1%
 
Other values (8)51334.8%
 

highway MPG
Real number (ℝ≥0)

Distinct: 59
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.63748531
Minimum: 12
Maximum: 354
Zeros: 0
Zeros (%): 0.0%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 12
5-th percentile: 17
Q1: 22
Median: 26
Q3: 30
95-th percentile: 38
Maximum: 354
Range: 342
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.863000767
Coefficient of variation (CV): 0.3327266318
Kurtosis: 181.6273643
Mean: 26.63748531
Median Absolute Deviation (MAD): 4
Skewness: 7.573931189
Sum: 317359
Variance: 78.5527826
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
24 | 876 | 7.4%
23 | 801 | 6.7%
26 | 778 | 6.5%
22 | 753 | 6.3%
25 | 731 | 6.1%
28 | 682 | 5.7%
27 | 585 | 4.9%
31 | 568 | 4.8%
30 | 547 | 4.6%
20 | 515 | 4.3%
21 | 514 | 4.3%
19 | 508 | 4.3%
29 | 493 | 4.1%
17 | 413 | 3.5%
33 | 401 | 3.4%
18 | 390 | 3.3%
32 | 340 | 2.9%
34 | 287 | 2.4%
35 | 222 | 1.9%
36 | 222 | 1.9%
16 | 220 | 1.8%
37 | 184 | 1.5%
38 | 154 | 1.3%
15 | 129 | 1.1%
39 | 110 | 0.9%
Other values (34) | 491 | 4.1%

Value | Count | Frequency (%)
12 | 7 | 0.1%
13 | 25 | 0.2%
14 | 42 | 0.4%
15 | 129 | 1.1%
16 | 220 | 1.8%
17 | 413 | 3.5%
18 | 390 | 3.3%
19 | 508 | 4.3%
20 | 515 | 4.3%
21 | 514 | 4.3%

Value | Count | Frequency (%)
354 | 1 | < 0.1%
111 | 3 | < 0.1%
110 | 2 | < 0.1%
109 | 6 | 0.1%
108 | 1 | < 0.1%
107 | 2 | < 0.1%
106 | 2 | < 0.1%
105 | 8 | 0.1%
103 | 2 | < 0.1%
102 | 3 | < 0.1%

city mpg
Real number (ℝ≥0)

Distinct: 69
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.73325499
Minimum: 7
Maximum: 137
Zeros: 0
Zeros (%): 0.0%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 7
5-th percentile: 12
Q1: 16
Median: 18
Q3: 22
95-th percentile: 29
Maximum: 137
Range: 130
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 8.98779816
Coefficient of variation (CV): 0.4554645528
Kurtosis: 74.6234866
Mean: 19.73325499
Median Absolute Deviation (MAD): 3
Skewness: 7.106680706
Sum: 235102
Variance: 80.78051577
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
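Value | Count | Frequency (%)
17 | 1230 | 10.3%
16 | 1106 | 9.3%
15 | 1038 | 8.7%
18 | 997 | 8.4%
19 | 841 | 7.1%
20 | 795 | 6.7%
21 | 638 | 5.4%
14 | 634 | 5.3%
22 | 634 | 5.3%
13 | 617 | 5.2%
23 | 500 | 4.2%
24 | 466 | 3.9%
25 | 444 | 3.7%
12 | 302 | 2.5%
27 | 260 | 2.2%
11 | 225 | 1.9%
26 | 216 | 1.8%
28 | 176 | 1.5%
30 | 130 | 1.1%
31 | 122 | 1.0%
29 | 101 | 0.8%
10 | 83 | 0.7%
9 | 33 | 0.3%
41 | 29 | 0.2%
32 | 22 | 0.2%
Other values (44) | 275 | 2.3%

Value | Count | Frequency (%)
7 | 1 | < 0.1%
8 | 9 | 0.1%
9 | 33 | 0.3%
10 | 83 | 0.7%
11 | 225 | 1.9%
12 | 302 | 2.5%
13 | 617 | 5.2%
14 | 634 | 5.3%
15 | 1038 | 8.7%
16 | 1106 | 9.3%

Value | Count | Frequency (%)
137 | 3 | < 0.1%
132 | 2 | < 0.1%
129 | 1 | < 0.1%
128 | 8 | 0.1%
126 | 14 | 0.1%
124 | 2 | < 0.1%
122 | 1 | < 0.1%
121 | 3 | < 0.1%
120 | 5 | < 0.1%
110 | 3 | < 0.1%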

Popularity
Real number (ℝ≥0)

Distinct: 48
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1554.911197
Minimum: 2
Maximum: 5657
Zeros: 0
Zeros (%): 0.0%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 2
5-th percentile: 155
Q1: 549
Median: 1385
Q3: 2009
95-th percentile: 5657
Maximum: 5657
Range: 5655
Interquartile range (IQR): 1460

Descriptive statistics

Standard deviation: 1441.855347
Coefficient of variation (CV): 0.9272911209
Kurtosis: 2.277089663
Mean: 1554.911197
Median Absolute Deviation (MAD): 768
Skewness: 1.653442529
Sum: 18525212
Variance: 2078946.841
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
Value | Count | Frequency (%)
1385 | 1123 | 9.4%
5657 | 881 | 7.4%
873 | 809 | 6.8%
2031 | 746 | 6.3%
1851 | 626 | 5.3%
2009 | 558 | 4.7%
549 | 515 | 4.3%
2202 | 449 | 3.8%
586 | 423 | 3.6%
1624 | 397 | 3.3%
617 | 353 | 3.0%
481 | 351 | 2.9%
3916 | 334 | 2.8%
190 | 330 | 2.8%
3105 | 328 | 2.8%
1439 | 303 | 2.5%
870 | 281 | 2.4%
640 | 256 | 2.1%
204 | 252 | 2.1%
1720 | 231 | 1.9%
436 | 213 | 1.8%
454 | 202 | 1.7%
155 | 196 | 1.6%
1013 | 187 | 1.6%
210 | 186 | 1.6%
Other values (23) | 1384 | 11.6%

Value | Count | Frequency (%)
2 | 3 | < 0.1%
21 | 3 | < 0.1%
26 | 150 | 1.3%
61 | 164 | 1.4%
67 | 16 | 0.1%
86 | 31 | 0.3%
105 | 60 | 0.5%
113 | 5 | < 0.1%
130 | 17 | 0.1%
155 | 196 | 1.6%

Value | Count | Frequency (%)
5657 | 881 | 7.4%
3916 | 334 | 2.8%
3105 | 328 | 2.8%
2774 | 69 | 0.6%
2202 | 449 | 3.8%
2031 | 746 | 6.3%
2009 | 558 | 4.7%
1851 | 626 | 5.3%
1720 | 231 | 1.9%
1715 | 136 | 1.1%

MSRP
Real number (ℝ≥0)

Distinct: 6049
Distinct (%): 50.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40594.73703
Minimum: 2000
Maximum: 2065902
Zeros: 0
Zeros (%): 0.0%
Memory size: 93.2 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2000
Q1: 21000
Median: 29995
Q3: 42231.25
95-th percentile: 107530.25
Maximum: 2065902
Range: 2063902
Interquartile range (IQR): 21231.25

Descriptive statistics

Standard deviation: 60109.1036
Coefficient of variation (CV): 1.480711737
Kurtosis: 268.9262765
Mean: 40594.73703
Median Absolute Deviation (MAD): 10295
Skewness: 11.77198713
Sum: 483645697
Variance: 3613104336
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2000 | 1036 | 8.7%
25995 | 19 | 0.2%
29995 | 19 | 0.2%
20995 | 16 | 0.1%
27995 | 16 | 0.1%
23995 | 15 | 0.1%
30995 | 15 | 0.1%
24995 | 14 | 0.1%
21995 | 14 | 0.1%
26195 | 13 | 0.1%
22995 | 12 | 0.1%
27295 | 12 | 0.1%
39995 | 11 | 0.1%
49800 | 11 | 0.1%
29395 | 10 | 0.1%
40950 | 10 | 0.1%
25095 | 10 | 0.1%
22300 | 10 | 0.1%
43950 | 10 | 0.1%
32395 | 10 | 0.1%
24895 | 10 | 0.1%
31790 | 10 | 0.1%
31995 | 9 | 0.1%
48300 | 9 | 0.1%
15995 | 9 | 0.1%
Other values (6024) | 10584 | 88.8%

Value | Count | Frequency (%)
2000 | 1036 | 8.7%
2002 | 1 | < 0.1%
2003 | 1 | < 0.1%
2008 | 2 | < 0.1%
2012 | 1 | < 0.1%
2013 | 1 | < 0.1%
2019 | 1 | < 0.1%
2027 | 1 | < 0.1%
2028 | 1 | < 0.1%
2032 | 1 | < 0.1%

Value | Count | Frequency (%)
2065902 | 1 | < 0.1%
1705769 | 1 | < 0.1%
1500000 | 2 | < 0.1%
1382750 | 1 | < 0.1%
1380000 | 1 | < 0.1%
643330 | 1 | < 0.1%
548800 | 2 | < 0.1%
535500 | 1 | < 0.1%
506500 | 1 | < 0.1%
497650 | 2 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and positive changes in scale of the two variables, which implies that for an exactly linear relationship the steepness of the line (its angle to the x-axis) does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
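
As a minimal sketch of this definition (not the report software's own implementation), the function below divides the covariance by the product of the standard deviations using only the Python standard library. The name `pearson_r` is illustrative, and the six value pairs are (highway MPG, city mpg) pairs as they appear in the sample rows of this report; they are far too few to estimate the dataset's actual correlation.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# Illustrative pairs only.
highway = [26, 28, 28, 19, 23, 26]
city = [19, 20, 18, 12, 16, 17]
r = pearson_r(highway, city)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 10 for x in highway], city)
print(r, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale.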

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
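
The rank-based calculation can be sketched as follows, assuming ties receive the average of the ranks they span (a common convention, not confirmed by this report). The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A nonlinear but strictly increasing relationship: rho is 1, r is below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

The cubic example shows why ρ captures monotonic relationships that Pearson's r understates.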

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
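
Taken literally, this description corresponds to the variant usually called tau-a. The sketch below implements that variant with an O(n²) pair scan; the function name `kendall_tau_a` is illustrative. Library implementations commonly apply a tie correction (tau-b), so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```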

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
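
The exact estimator used by the report software is not shown in this text. As a hedged sketch, the function below follows the commonly cited form of Bergsma's correction: the chi-squared statistic is scaled by the sample size, a bias term is subtracted, and the table dimensions are adjusted before normalising. The name `cramers_v_corrected` and the contingency tables are illustrative.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows of counts."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    if r < 2 or k < 2 or n < 2:
        raise ValueError("need at least a 2x2 table with 2 or more observations")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

independent = cramers_v_corrected([[10, 10], [10, 10]])
perfect = cramers_v_corrected([[50, 0], [0, 50]])
partial = cramers_v_corrected([[30, 10], [10, 30]])
print(independent, perfect, partial)
```

An independent table gives 0, a perfectly associated table gives 1, and the partially associated table falls in between.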

Missing values

Sample

First rows

Index | Make | Model | Year | Engine Fuel Type | Engine HP | Engine Cylinders | Transmission Type | Driven_Wheels | Number of Doors | Market Category | Vehicle Size | Vehicle Style | highway MPG | city mpg | Popularity | MSRP
0 | BMW | 1 Series M | 2011 | premium unleaded (required) | 335.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Factory Tuner,Luxury,High-Performance | Compact | Coupe | 26 | 19 | 3916 | 46135
1 | BMW | 1 Series | 2011 | premium unleaded (required) | 300.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury,Performance | Compact | Convertible | 28 | 19 | 3916 | 40650
2 | BMW | 1 Series | 2011 | premium unleaded (required) | 300.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury,High-Performance | Compact | Coupe | 28 | 20 | 3916 | 36350
3 | BMW | 1 Series | 2011 | premium unleaded (required) | 230.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury,Performance | Compact | Coupe | 28 | 18 | 3916 | 29450
4 | BMW | 1 Series | 2011 | premium unleaded (required) | 230.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury | Compact | Convertible | 28 | 18 | 3916 | 34500
5 | BMW | 1 Series | 2012 | premium unleaded (required) | 230.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury,Performance | Compact | Coupe | 28 | 18 | 3916 | 31200
6 | BMW | 1 Series | 2012 | premium unleaded (required) | 300.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury,Performance | Compact | Convertible | 26 | 17 | 3916 | 44100
7 | BMW | 1 Series | 2012 | premium unleaded (required) | 300.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury,High-Performance | Compact | Coupe | 28 | 20 | 3916 | 39300
8 | BMW | 1 Series | 2012 | premium unleaded (required) | 230.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury | Compact | Convertible | 28 | 18 | 3916 | 36900
9 | BMW | 1 Series | 2013 | premium unleaded (required) | 230.0 | 6.0 | MANUAL | rear wheel drive | 2.0 | Luxury | Compact | Convertible | 27 | 18 | 3916 | 37200

Last rows

Index | Make | Model | Year | Engine Fuel Type | Engine HP | Engine Cylinders | Transmission Type | Driven_Wheels | Number of Doors | Market Category | Vehicle Size | Vehicle Style | highway MPG | city mpg | Popularity | MSRP
11904 | BMW | Z8 | 2002 | premium unleaded (required) | 394.0 | 8.0 | MANUAL | rear wheel drive | 2.0 | Exotic,Luxury,High-Performance | Compact | Convertible | 19 | 12 | 3916 | 130000
11905 | BMW | Z8 | 2003 | premium unleaded (required) | 394.0 | 8.0 | MANUAL | rear wheel drive | 2.0 | Exotic,Luxury,High-Performance | Compact | Convertible | 19 | 12 | 3916 | 131500
11906 | Acura | ZDX | 2011 | premium unleaded (required) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 46020
11907 | Acura | ZDX | 2011 | premium unleaded (required) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 56570
11908 | Acura | ZDX | 2011 | premium unleaded (required) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 50520
11909 | Acura | ZDX | 2012 | premium unleaded (required) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 46120
11910 | Acura | ZDX | 2012 | premium unleaded (required) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 56670
11911 | Acura | ZDX | 2012 | premium unleaded (required) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 50620
11912 | Acura | ZDX | 2013 | premium unleaded (recommended) | 300.0 | 6.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Hatchback,Luxury | Midsize | 4dr Hatchback | 23 | 16 | 204 | 50920
11913 | Lincoln | Zephyr | 2006 | regular unleaded | 221.0 | 6.0 | AUTOMATIC | front wheel drive | 4.0 | Luxury | Midsize | Sedan | 26 | 17 | 61 | 28995

Duplicate rows

Most frequent

Index | Make | Model | Year | Engine Fuel Type | Engine HP | Engine Cylinders | Transmission Type | Driven_Wheels | Number of Doors | Market Category | Vehicle Size | Vehicle Style | highway MPG | city mpg | Popularity | MSRP | count
96 | Land Rover | Range Rover Evoque | 2015 | premium unleaded (recommended) | 240.0 | 4.0 | AUTOMATIC | all wheel drive | 4.0 | Crossover,Luxury | Compact | 4dr SUV | 30 | 21 | 258 | 48300 | 4
217 | Volkswagen | CC | 2016 | premium unleaded (recommended) | 200.0 | 4.0 | AUTOMATED_MANUAL | front wheel drive | 4.0 | Performance | Midsize | Sedan | 31 | 22 | 873 | 37820 | 4
221 | Volkswagen | CC | 2017 | premium unleaded (recommended) | 200.0 | 4.0 | AUTOMATED_MANUAL | front wheel drive | 4.0 | Performance | Midsize | Sedan | 31 | 22 | 873 | 37820 | 4
300 | Volkswagen | New Beetle | 2008 | regular unleaded | 150.0 | 5.0 | AUTOMATIC | front wheel drive | 2.0 | Hatchback | Compact | 2dr Hatchback | 29 | 20 | 873 | 20990 | 4
305 | Volkswagen | New Beetle | 2010 | regular unleaded | 150.0 | 5.0 | AUTOMATIC | front wheel drive | 2.0 | Hatchback | Compact | 2dr Hatchback | 29 | 20 | 873 | 20390 | 4
8 | Audi | 100 | 1992 | regular unleaded | 172.0 | 6.0 | MANUAL | front wheel drive | 4.0 | Luxury | Midsize | Sedan | 24 | 17 | 3105 | 2000 | 3
9 | Audi | 100 | 1993 | regular unleaded | 172.0 | 6.0 | MANUAL | front wheel drive | 4.0 | Luxury | Midsize | Sedan | 24 | 17 | 3105 | 2000 | 3
84 | Hyundai | Veloster | 2014 | regular unleaded | 132.0 | 4.0 | AUTOMATED_MANUAL | front wheel drive | 3.0 | Hatchback | Compact | 2dr Hatchback | 36 | 28 | 1439 | 19050 | 3
86 | Hyundai | Veloster | 2014 | regular unleaded | 138.0 | 4.0 | MANUAL | front wheel drive | 3.0 | Hatchback | Compact | 2dr Hatchback | 35 | 26 | 1439 | 17800 | 3
222 | Volkswagen | GLI | 2008 | premium unleaded (required) | 200.0 | 4.0 | AUTOMATED_MANUAL | front wheel drive | 4.0 | Performance | Compact | Sedan | 29 | 22 | 873 | 25375 | 3